{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "c5148957",
   "metadata": {},
   "source": [
    "# Skforecast in GPU"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8ce2a3d8",
   "metadata": {},
   "source": [
    "Traditionally, machine learning algorithms have been executed on CPUs (Central Processing Units)—general-purpose processors capable of handling a wide variety of tasks. However, CPUs are not optimized for the highly parallelized matrix operations that many machine learning algorithms rely on, often resulting in slower training times and limited scalability. In contrast, GPUs (Graphics Processing Units) are specifically designed for parallel processing, capable of performing thousands of simultaneous mathematical operations. This makes them particularly well-suited for training and deploying large-scale machine learning models.\n",
    "\n",
    "Several popular machine learning libraries have implemented GPU acceleration, including **XGBoost**, **LightGBM**, **CatBoost** and **CuML**. By leveraging GPU capabilities, these libraries can dramatically reduce training times and enhance scalability. The following sections demonstrate how execute **skforecast** with GPU acceleration to build efficient and powerful forecasting models."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b09bde5",
   "metadata": {},
   "source": [
    "<div class=\"admonition note\" name=\"html-admonition\" style=\"background: rgba(0,184,212,.1); padding-top: 0px; padding-bottom: 6px; border-radius: 8px; border-left: 8px solid #00b8d4; border-color: #00b8d4; padding-left: 10px; padding-right: 10px;\">\n",
    "\n",
    "<p class=\"title\">\n",
    "    <i style=\"font-size: 18px; color:#00b8d4;\"></i>\n",
    "    <b style=\"color: #00b8d4;\">&#9998 Note</b>\n",
    "</p>\n",
    "\n",
    "<p>\n",
    "\n",
    "The performance advantage of using a GPU depends heavily on the specific task and the size of the dataset. Generally, GPU acceleration offers the greatest benefits when working with large datasets and complex models, where its parallel processing capabilities can significantly reduce training times.\n",
    "\n",
    "In recursive forecasting (<code>ForecasterRecursive</code> and <code>ForecasterRecursiveMultiseries</code>) the <b>prediction phase must be executed sequentially since each time step depends on the previous prediction</b>. This inherent dependency prevents parallelization during inference, which explains why model fitting is substantially faster on a GPU, while prediction can actually be slower compared to using a CPU. To overcome this limitation, <b>skforecast automatically switches the estimator to use the CPU for prediction</b>, even if it was trained on a GPU.\n",
    "\n",
    "In contrast, direct forecasters (<code>ForecasterDirect</code>, <code>ForecasterDirectMultivariate</code>) do not rely on previous predictions during inference. This lack of dependency allows both training and prediction to fully benefit from GPU acceleration.\n",
    "\n",
    "</p>\n",
    "\n",
    "</div>\n",
    "\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "4b9fd63a",
   "metadata": {},
   "source": [
    "<div class=\"admonition note\" name=\"html-admonition\" style=\"background: rgba(0,184,212,.1); padding-top: 0px; padding-bottom: 6px; border-radius: 8px; border-left: 8px solid #00b8d4; border-color: #00b8d4; padding-left: 10px; padding-right: 10px;\">\n",
    "\n",
    "<p class=\"title\">\n",
    "    <i style=\"font-size: 18px; color:#00b8d4;\"></i>\n",
    "    <b style=\"color: #00b8d4;\">&#9998 Note</b>\n",
    "</p>\n",
    "\n",
    "<p>\n",
    "\n",
    "Despite the significant advantages offered by GPUs (specifically Nvidia GPUs) in accelerating machine learning computations, access to them is often limited due to high costs or other practical constraints. Fortunatelly, <b>Google Colaboratory (Colab)</b>, a free Jupyter notebook environment, allows users to run Python code in the cloud, with access to powerful hardware resources such as GPUs. This makes it an excellent platform for experimenting with machine learning models, especially those that require intensive computations. The following links provide access to Google Colab notebooks that demonstrate how to use skforecast with GPU acceleration.\n",
    "</p>\n",
    "\n",
    "<ul>\n",
    "    <li><a href=\"https://colab.research.google.com/drive/10PYQFQN9oNkAHh0X7wwyBLQ3JQ_Cm7pP?usp=sharing\">Skforecast in GPU: XGBoost</a></li>\n",
    "    <li><a href=\"https://colab.research.google.com/drive/17Csc70AY-GQA-tvZjq9TYCbmnrNOzslh?usp=sharing\">Skforecast in GPU: LightGBM</a></li>\n",
    "    <li><a href=\"https://colab.research.google.com/drive/1Z-n0kKEnQvY02e9-HxKbkTdLc10RNd_-?usp=sharing\">Skforecast in GPU: CatBoost</a></li>\n",
    "    <li><a href=\"https://colab.research.google.com/drive/1RlHrd7QQ0z4WaA93hrn9zCnHCVC7wbWU?usp=sharing\">Skforecast in GPU: Rapids cuML</a></li>\n",
    "</ul>\n",
    "\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "8ece7fe1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "skforecast version : 0.19.0\n",
      "xgboost version    : 3.1.2\n",
      "lightgbm version   : 4.6.0\n",
      "catboost version   : 1.2.8\n"
     ]
    }
   ],
   "source": [
    "# Libraries\n",
    "# ==============================================================================\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import torch\n",
    "import psutil\n",
    "import xgboost\n",
    "from xgboost import XGBRegressor\n",
    "import lightgbm\n",
    "from lightgbm import LGBMRegressor\n",
    "import catboost\n",
    "from catboost import CatBoostRegressor\n",
    "import warnings\n",
    "\n",
    "import skforecast\n",
    "from skforecast.recursive import ForecasterRecursive\n",
    "from skforecast.model_selection import backtesting_forecaster, TimeSeriesFold\n",
    "\n",
    "print(f\"skforecast version : {skforecast.__version__}\")\n",
    "print(f\"xgboost version    : {xgboost.__version__}\")\n",
    "print(f\"lightgbm version   : {lightgbm.__version__}\")\n",
    "print(f\"catboost version   : {catboost.__version__}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "b7efc087",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using device: cuda\n",
      "Tesla T4\n",
      "Memory Usage:\n",
      "Allocated : 0.0 GB\n",
      "Reserved  : 0.0 GB\n",
      "CPU RAM Free: 11.23 GB\n"
     ]
    }
   ],
   "source": [
    "# Print information abput the GPU and CPU\n",
    "# ==============================================================================\n",
    "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
    "print('Using device:', device)\n",
    "\n",
    "if device.type == 'cuda':\n",
    "    print(torch.cuda.get_device_name(0))\n",
    "    print('Memory Usage:')\n",
    "    print('Allocated :', round(torch.cuda.memory_allocated(0) / 1024**3, 1), 'GB')\n",
    "    print('Reserved  :', round(torch.cuda.memory_reserved(0) / 1024**3, 1), 'GB')\n",
    "\n",
    "print(f\"CPU RAM Free: {psutil.virtual_memory().available / 1024**3:.2f} GB\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "c2db9943",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1990-01-01 00:00:00</th>\n",
       "      <td>0.190776</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1990-01-01 01:00:00</th>\n",
       "      <td>0.121986</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div><br><label><b>dtype:</b> float64</label>"
      ],
      "text/plain": [
       "1990-01-01 00:00:00    0.190776\n",
       "1990-01-01 01:00:00    0.121986\n",
       "Freq: h, Name: y, dtype: float64"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Data\n",
    "# ==============================================================================\n",
    "n = 1_000_000\n",
    "data = pd.Series(\n",
    "    data  = np.random.normal(size=n), \n",
    "    index = pd.date_range(start=\"1990-01-01\", periods=n, freq=\"h\"),\n",
    "    name  = 'y'\n",
    ")\n",
    "data.head(2)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "074ca42f",
   "metadata": {},
   "source": [
    "## XGBoost"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b05bc5e0",
   "metadata": {},
   "source": [
    "To run an XGBoost model (version 2.0 or higher) on a GPU, set the argument device='cuda' during initialization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "516970c7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Suppress warnings\n",
    "# ==============================================================================\n",
    "warnings.filterwarnings(\n",
    "    \"ignore\",\n",
    "    message=\".*Falling back to prediction using DMatrix.*\",\n",
    "    category=UserWarning,\n",
    "    module=\"xgboost\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "895b3ab8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using GPU: 0 days 00:00:25.095668\n",
      "Prediction time using GPU: 0 days 00:00:00.155853\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "0abfe5b4fb82494aa7f1284cfcfc9b3b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using GPU: 0 days 00:00:32.672748\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a XGBRegressor using GPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = XGBRegressor(\n",
    "                                 n_estimators = 1000,\n",
    "                                 device       = 'cuda',\n",
    "                                 verbosity    = 1\n",
    "                             ),\n",
    "                 lags = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "cv = TimeSeriesFold(\n",
    "         steps              = 100,\n",
    "         initial_train_size = 990_000,\n",
    "         refit              = False,\n",
    "         verbose            = False\n",
    "     )\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using GPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "357f471b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using CPU: 0 days 00:03:14.851675\n",
      "Prediction time using CPU: 0 days 00:00:00.156767\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fc50e956c1074db7a79eae7f5611e0f6",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using CPU: 0 days 00:02:55.231735\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a XGBRegressor using CPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = XGBRegressor(n_estimators=1000),\n",
    "                 lags      = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using CPU: {elapsed_time}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a27b9011",
   "metadata": {},
   "source": [
    "## LightGBM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "34e6345b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Suppress warnings\n",
    "# ==============================================================================\n",
    "warnings.filterwarnings(\n",
    "    \"ignore\",\n",
    "    message=\"'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.\",\n",
    "    category=FutureWarning,\n",
    "    module=\"sklearn.utils.deprecation\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "600ac583",
   "metadata": {},
   "source": [
    "When using **Google colab**, run the following in a notebook cell to ensure LightGBM can utilize the NVIDIA GPU when executing in google colab.\n",
    "\n",
    "```bash\n",
    "!mkdir -p /etc/OpenCL/vendors && echo \"libnvidia-opencl.so.1\" > /etc/OpenCL/vendors/nvidia.icd\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "da945f75",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using GPU: 0 days 00:01:11.161442\n",
      "Prediction time using GPU: 0 days 00:00:00.079405\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "3a511628253d476ea17fd9afaddebd6f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using GPU: 0 days 00:00:36.365922\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a LGBMRegressor using GPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = LGBMRegressor(n_estimators=1000, device='gpu', verbose=-1),\n",
    "                 lags      = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using GPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "acdbf58a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using CPU: 0 days 00:02:23.478723\n",
      "Prediction time using CPU: 0 days 00:00:00.079490\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fc314730af4046af96a6de17fc0460e2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using CPU: 0 days 00:01:55.754227\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a LGBMRegressor using CPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = LGBMRegressor(n_estimators=1000, device='cpu', verbose=-1),\n",
    "                 lags      = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using CPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ae97578",
   "metadata": {},
   "source": [
    "## CatBoost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "7ff7c2cb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using GPU: 0 days 00:00:18.077499\n",
      "Prediction time using GPU: 0 days 00:00:00.078790\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5a7573880f64443490a13f386219f9f2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using GPU: 0 days 00:00:25.979210\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a CatBoostRegressor using GPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = CatBoostRegressor(n_estimators=1000, task_type='GPU', silent=True, allow_writing_files=False),\n",
    "                 lags      = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using GPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "37ab6127",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using CPU: 0 days 00:03:40.342547\n",
      "Prediction time using CPU: 0 days 00:00:00.077451\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9822a81c405946d1858efdcadd893355",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using CPU: 0 days 00:03:44.589879\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a CatBoostRegressor using CPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = CatBoostRegressor(n_estimators=1000, task_type='CPU', silent=True, allow_writing_files=False),\n",
    "                 lags      = 50\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using CPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62be5f8a",
   "metadata": {},
   "source": [
    "## RAPIDS cuML\n",
    "\n",
    "[**cuML**](https://docs.rapids.ai/api/cuml/stable) is a library for executing machine learning algorithms on GPUs with an API that closely follows the scikit-learn API. \n",
    "\n",
    "To use cuML with skforecast, you need to install the RAPIDS cuML library. The installation process may vary depending on your environment and the version of CUDA you have installed. You can find detailed installation instructions in the [RAPIDS documentation](https://rapids.ai/start.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "062623a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reduce size of data to reduce execution time\n",
    "# ==============================================================================\n",
    "data = data.iloc[:100_000].copy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e8e1bc78",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "skforecast version : 0.19.0\n",
      "cuml version       : 25.10.00\n"
     ]
    }
   ],
   "source": [
    "# Libraries\n",
    "# ==============================================================================\n",
    "import cuml\n",
    "from sklearn.ensemble import RandomForestRegressor\n",
    "\n",
    "print(f\"skforecast version : {skforecast.__version__}\")\n",
    "print(f\"cuml version       : {cuml.__version__}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b83b8d5e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using GPU: 0 days 00:00:02.977651\n",
      "Prediction time using GPU: 0 days 00:00:00.082757\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "dade0563eb1140389970b01d900e80b4",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using GPU: 0 days 00:00:11.755529\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a RandomForestRegressor using GPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                estimator = cuml.ensemble.RandomForestRegressor(\n",
    "                                n_estimators=200,\n",
    "                                max_depth=5,\n",
    "                                output_type='numpy'\n",
    "                            ),\n",
    "                lags = 20\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using GPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "cv = TimeSeriesFold(\n",
    "         steps              = 100,\n",
    "         initial_train_size = 90_000,\n",
    "         refit              = False,\n",
    "         verbose            = False\n",
    "     )\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using GPU: {elapsed_time}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "85b5e2e5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training time using CPU: 0 days 00:04:13.501165\n",
      "Prediction time using CPU: 0 days 00:00:01.354556\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "61373ac928d649bd97ce30c379ad7425",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "  0%|          | 0/100 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Backtesting time using CPU: 0 days 00:06:14.469784\n"
     ]
    }
   ],
   "source": [
    "# Create and train forecaster with a RandomForestRegressor using CPU\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                RandomForestRegressor(n_estimators=200, max_depth=5),\n",
    "                lags = 20\n",
    "             )\n",
    "\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.fit(y=data)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Training time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Predict\n",
    "# ==============================================================================\n",
    "start_time = pd.Timestamp.now()\n",
    "forecaster.predict(steps=100)\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Prediction time using CPU: {elapsed_time}\")\n",
    "\n",
    "# Backtesting\n",
    "# ==============================================================================\n",
    "cv = TimeSeriesFold(\n",
    "         steps              = 100,\n",
    "         initial_train_size = 90_000,\n",
    "         refit              = False,\n",
    "         verbose            = False\n",
    "     )\n",
    "start_time = pd.Timestamp.now()\n",
    "_ = backtesting_forecaster(\n",
    "        forecaster = forecaster,\n",
    "        y          = data,\n",
    "        cv         = cv,\n",
    "        metric     = 'mean_absolute_error'\n",
    "\n",
    "    )\n",
    "elapsed_time = pd.Timestamp.now() - start_time\n",
    "\n",
    "print(f\"Backtesting time using CPU: {elapsed_time}\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
